




Learning to Orient Surfaces by Self-supervised Spherical CNNs (Supplementary Material)

Neural Information Processing Systems

Results for 3DMatch are shown in Table 1: the performance gain achieved by Compass when deploying the proposed data augmentation validates its importance. Indeed, without the proposed augmentation FLARE performs better than Compass on this dataset. This dataset has been specifically proposed to verify the invariance to rotations of the learned 3D descriptors [1], and contains only a test split. In Figure 2, we consider two pairs of local surface patches and their corresponding feature maps: both patches forming a pair are extracted around the same keypoint on different fragments. The canonical pose computed for the first pair is repeatable, while the second pair represents a failure of Compass.


05311655a15b75fab86956663e1819cd-Supplemental.pdf

Neural Information Processing Systems

In what follows we refer to each experiment by its corresponding figure or table number for convenience. For the rotated/shifted MNIST images (Figures 8 and 9), we use the affine transformation function in the TorchVision library. In experiments (Tables 2, 3, 4, 5), we use either or both of the Large (L) and Small (S) datasets for the standard benchmark vision data: MNIST, FMNIST, KMNIST, Omniglot, SVHN, CIFAR10, CIFAR100, CELEBA. For Figure 10 and Table 3, the regularization coefficients for CAE and WAE are searched around 0.01 0.001, the noise level used in DAE is searched around 0.1 0.01, and the regularization coefficient and λ for SPAE and NRAE are searched around 0.001. On the other hand, the runtimes of our algorithms are comparable with other existing methods.


Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries

Xu, Lihan, Dong, Yanjie, Wang, Gang, Zeng, Runhao, Fan, Xiaoyi, Hu, Xiping

arXiv.org Artificial Intelligence

Abstract: We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast convergence that is safeguarded against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies. As a promising paradigm for privacy-preserving distributed learning, federated learning (FL) leverages the parallel computational capabilities of user terminals to learn from decentralized data with the orchestration of a central server. Since its inception [1], [2], FL has been proliferating across diverse application scenarios, e.g., healthcare [3], [4], mobile edge [5], [6], and autonomous driving [7], [8]. Despite its merits in preserving user privacy, the vanilla FL paradigm still faces two major challenges, namely, Byzantine resilience [9], [10] and communication efficiency [11]. To robustify the FL paradigm, Byzantine-resilient aggregation rules, e.g., Krum [10], the component-wise median (CwMed) [15], Bulyan [16], and geometric median (GeoMed) [17], are designed to enhance the trustworthiness and reliability of the FL paradigm. Another major challenge in FL lies in enhancing communication efficiency.
Current communication-efficient FL algorithms can be broadly classified into three categories: (i) communication frequency reduction [18], [19], [20], [21], [22], [12], (ii) exchanged information compression [23], [24], [25], [6], and (iii) iteration reduction [20], [26], [27], [28].
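
The combination of a Byzantine-resilient aggregation rule with Nesterov's momentum can be illustrated with a small sketch. This is not the Byrd-NAFL algorithm from the paper: it assumes a toy quadratic loss, server-side momentum, the component-wise median as the aggregation rule, and hypothetical names (`cw_median`, `run_toy_training`) and hyperparameters chosen only for illustration.

```python
from statistics import median


def cw_median(vectors):
    """Component-wise median of a list of equal-length vectors."""
    return [median(coords) for coords in zip(*vectors)]


def quadratic_grad(w, target):
    """Gradient of the toy loss 0.5 * ||w - target||^2 (an assumption)."""
    return [wi - ti for wi, ti in zip(w, target)]


def run_toy_training(steps=200, lr=0.1, beta=0.9):
    target = [1.0, -2.0]           # minimizer of the toy loss
    w = [0.0, 0.0]                 # model parameters
    v = [0.0, 0.0]                 # momentum buffer
    offsets = [-0.02, -0.01, 0.0, 0.01, 0.02]  # honest worker noise
    for _ in range(steps):
        # Nesterov step: gradients are evaluated at the look-ahead point.
        lookahead = [wi + beta * vi for wi, vi in zip(w, v)]
        g = quadratic_grad(lookahead, target)
        honest = [[gi + o for gi in g] for o in offsets]
        # Two Byzantine workers send arbitrary, very large vectors.
        byzantine = [[1e6, -1e6], [-1e6, 1e6]]
        agg = cw_median(honest + byzantine)
        v = [beta * vi - lr * ai for vi, ai in zip(v, agg)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return w


if __name__ == "__main__":
    print(run_toy_training())
```

In this toy setting the median discards the two corrupted vectors in every coordinate, so the iterates approach the minimizer despite the attack. Real deployments differ in where momentum is applied and in how many adversaries a given rule tolerates.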


BiMax: Bidirectional MaxSim Score for Document-Level Alignment

Wang, Xiaotian, Utsuro, Takehito, Nagata, Masaaki

arXiv.org Artificial Intelligence

Document alignment is necessary for hierarchical mining (Bañón et al., 2020; Morishita et al., 2022), which aligns documents across source and target languages within the same web domain. Several high-precision sentence embedding-based methods have been developed, such as TK-PERT (Thompson and Koehn, 2020) and Optimal Transport (OT) (Clark et al., 2019; El-Kishky and Guzmán, 2020). However, given the massive scale of web mining data, both accuracy and speed must be considered. In this paper, we propose a cross-lingual Bidirectional MaxSim score (BiMax) for computing doc-to-doc similarity, to improve efficiency compared to the OT method. Consequently, on the WMT16 bilingual document alignment task, BiMax attains accuracy comparable to OT with an approximate 100-fold speed increase. Meanwhile, we also conduct a comprehensive analysis to investigate the performance of current state-of-the-art multilingual sentence embedding models. All the alignment methods in this paper are publicly available as a tool called EmbDA (https://github.com/EternalEdenn/EmbDA).
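
A bidirectional MaxSim score over sentence embeddings might look roughly like the sketch below. The abstract does not spell out the formula, so the details are assumptions: cosine similarity between sentence vectors, a mean over per-sentence maxima in each direction, and an unweighted average of the two directions. The function names are hypothetical and are not taken from EmbDA.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def maxsim(src, tgt):
    """For each source sentence take its best match in tgt, then average."""
    return sum(max(cosine(s, t) for t in tgt) for s in src) / len(src)


def bimax(src, tgt):
    """Average the source-to-target and target-to-source MaxSim scores."""
    return 0.5 * (maxsim(src, tgt) + maxsim(tgt, src))


if __name__ == "__main__":
    doc_a = [[1.0, 0.0], [0.0, 1.0]]   # two sentence embeddings
    doc_b = [[1.0, 0.0]]               # one sentence embedding
    print(bimax(doc_a, doc_b))
```

Each direction needs only one pass over the pairwise similarity matrix followed by a row-wise maximum, which is much cheaper than solving a transport problem per document pair.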



Deep Survival Analysis for Competing Risk Modeling with Functional Covariates and Missing Data Imputation

Gao, Penglei, Zou, Yan, Duggal, Abhijit, Huang, Shuaiqi, Liang, Faming, Wang, Xiaofeng

arXiv.org Artificial Intelligence

We introduce the Functional Competing Risk Net (FCRN), a unified deep-learning framework for discrete-time survival analysis under competing risks, which seamlessly integrates functional covariates and handles missing data within an end-to-end model. By combining a micro-network Basis Layer for functional data representation with a gradient-based imputation module, FCRN simultaneously learns to impute missing values and predict event-specific hazards. Evaluated on multiple simulated datasets and a real-world ICU case study using the MIMIC-IV and Cleveland Clinic datasets, FCRN demonstrates substantial improvements in prediction accuracy over random survival forests and traditional competing risks models. This approach advances prognostic modeling in critical care by more effectively capturing dynamic risk factors and static predictors while accommodating irregular and incomplete data.
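
To make "event-specific hazards" in discrete time concrete, the sketch below converts per-interval cause-specific hazards into cumulative incidence curves using the standard discrete-time competing-risks identities. It is not FCRN itself: the hazards are given as plain numbers rather than predicted by a network, and the function name `cumulative_incidence` is hypothetical.

```python
def cumulative_incidence(hazards):
    """Turn discrete-time cause-specific hazards into cumulative incidence.

    hazards[t][k] is the probability of event k in interval t, given that
    no event has occurred before interval t.
    Returns a list of (cif_per_cause, event_free_probability) per interval.
    """
    n_causes = len(hazards[0])
    survival = 1.0                 # probability of being event-free so far
    cif = [0.0] * n_causes         # cumulative incidence for each cause
    curve = []
    for h_t in hazards:
        # Event k occurs in this interval only if the subject was still
        # event-free at the start of it.
        for k, h in enumerate(h_t):
            cif[k] += survival * h
        survival *= 1.0 - sum(h_t)
        curve.append((list(cif), survival))
    return curve


if __name__ == "__main__":
    toy_hazards = [[0.1, 0.2], [0.1, 0.2]]   # two intervals, two causes
    for step, (cif, surv) in enumerate(cumulative_incidence(toy_hazards), 1):
        print(step, cif, surv)
```

At every interval the cumulative incidences and the event-free probability sum to one, which is a useful sanity check on predicted hazards.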